On the Convergence of Optimistic Policy Iteration
Abstract
We consider a finite-state Markov decision problem and establish the convergence of a special case of optimistic policy iteration that involves Monte Carlo estimation of Q-values, in conjunction with greedy policy selection. We provide convergence results for a number of algorithmic variations, including one that involves temporal difference learning (bootstrapping) instead of Monte Carlo estimation. We also indicate some extensions that either fail or are unlikely to go through.
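As a rough illustration of the scheme the abstract describes, the sketch below runs a synchronous variant on a toy problem: every Q-value receives one Monte Carlo return generated under the policy that is greedy for the current estimates, and the estimate moves only part of the way toward that return before the policy is recomputed. The three-state chain, the rewards, the discount factor, the constant step size, the horizon cap, and all function names are assumptions made for this example; none of them come from the paper, and this is not presented as the paper's exact algorithm.

```python
import random

GAMMA = 0.9          # discount factor (assumed for this toy example)
TERMINAL = 2         # states 0 and 1 are non-terminal, state 2 is terminal
ACTIONS = (0, 1)     # 0 = advance (reward -1), 1 = stay (reward -2)


def step(state, action, slip, rng):
    """Toy dynamics: 'advance' moves one state forward unless it slips."""
    if action == 0:
        moved = rng.random() >= slip
        return (state + 1 if moved else state), -1.0
    return state, -2.0


def greedy(q, state):
    """Greedy action for `state`; ties go to the lowest action index."""
    return max(ACTIONS, key=lambda a: q[state][a])


def rollout_return(q, state, action, slip, rng, horizon=200):
    """Discounted return of one trajectory that starts with (state, action)
    and then follows the policy that is greedy with respect to `q`.
    The horizon cap truncates trajectories that never terminate."""
    total, discount = 0.0, 1.0
    for _ in range(horizon):
        state, reward = step(state, action, slip, rng)
        total += discount * reward
        discount *= GAMMA
        if state == TERMINAL:
            break
        action = greedy(q, state)
    return total


def optimistic_policy_iteration(n_iters=60, step_size=0.5, slip=0.0, seed=0):
    """Synchronous sketch: one Monte Carlo return per (state, action) pair
    under the current greedy policy, followed by a partial update of Q."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(TERMINAL)]
    for _ in range(n_iters):
        # All targets are sampled with the same Q before any entry changes,
        # so the policy is never evaluated to convergence between updates.
        targets = [[rollout_return(q, s, a, slip, rng) for a in ACTIONS]
                   for s in range(TERMINAL)]
        for s in range(TERMINAL):
            for a in ACTIONS:
                q[s][a] += step_size * (targets[s][a] - q[s][a])
    return q
```

Calling `optimistic_policy_iteration()` with the default `slip=0.0` makes the toy dynamics deterministic, so the estimates settle on the Q-values of the "always advance" policy. A positive `slip` makes the returns noisy, which is the setting where the choice of step size starts to matter.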
Related resources
On The Convergence Of Modified Noor Iteration For Nearly Lipschitzian Maps In Real Banach Spaces
In this paper, we obtain the convergence of the modified Noor iterative scheme for nearly Lipschitzian maps in real Banach spaces. Our results contribute to the literature in this area of research.
Weighted Sup-Norm Contractions in Dynamic Programming: A Review and Some New Applications
We consider a class of generalized dynamic programming models based on weighted sup-norm contractions. We provide an analysis that parallels the one available for discounted MDP and for generalized models based on unweighted sup-norm contractions. In particular, we discuss the main properties and associated algorithms of these models, including value iteration, policy iteration, and their optim...
On the Ishikawa iteration process in CAT(0) spaces
In this paper, several $\Delta$ and strong convergence theorems are established for the Ishikawa iterations for nonexpansive mappings in the framework of CAT(0) spaces. Our results extend and improve the corresponding results
Approximate Policy Iteration: A Survey and Some New Methods
We consider the classical policy iteration method of dynamic programming (DP), where approximations and simulation are used to deal with the curse of dimensionality. We survey a number of issues: convergence and rate of convergence of approximate policy evaluation methods, singularity and susceptibility to simulation noise of policy evaluation, exploration issues, constrained and enhanced polic...
Convergence of the multistage variational iteration method for solving a general system of ordinary differential equations
In this paper, the multistage variational iteration method is implemented to solve a general form of the system of first-order differential equations. The convergence of the proposed method is given. To illustrate the proposed method, it is applied to a model for HIV infection of CD4+ T cells and the numerical results are compared with those of a recently proposed method.
Journal title: Journal of Machine Learning Research
Volume: 3
Publication date: 2002